AITopics | hyperparameter combination

Collaborating Authors

hyperparameter combination

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

da1a97b53eec1c763c6d06835538fe3e-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-12-2026, 06:13:08 GMT

dataset, djective, quantization scale, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Montgomery County > Bethesda (0.05)
Europe > Germany > Saxony > Leipzig (0.04)
Europe > Germany > Berlin (0.04)
(4 more...)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

da1a97b53eec1c763c6d06835538fe3e-Supplemental-Conference.pdf

Neural Information Processing SystemsAug-19-2025, 08:41:04 GMT

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland > Montgomery County > Bethesda (0.05)
Europe > Germany > Saxony > Leipzig (0.04)
Europe > Germany > Berlin (0.04)
(4 more...)

Industry: Health & Medicine (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Survey of Active Learning Hyperparameters: Insights from a Large-Scale Experimental Grid

Gonsior, Julius, Rieß, Tim, Reusch, Anja, Hartmann, Claudio, Thiele, Maik, Lehner, Wolfgang

arXiv.org Artificial IntelligenceJun-5-2025

Annotating data is a time-consuming and costly task, but it is inherently required for supervised machine learning. Active Learning (AL) is an established method that minimizes human labeling effort by iteratively selecting the most informative unlabeled samples for expert annotation, thereby improving the overall classification performance. Even though AL has been known for decades, AL is still rarely used in real-world applications. As indicated in the two community web surveys among the NLP community about AL, two main reasons continue to hold practitioners back from using AL: first, the complexity of setting AL up, and second, a lack of trust in its effectiveness. We hypothesize that both reasons share the same culprit: the large hyperparameter space of AL. This mostly unexplored hyperparameter space often leads to misleading and irreproducible AL experiment results. In this study, we first compiled a large hyperparameter grid of over 4.6 million hyperparameter combinations, second, recorded the performance of all combinations in the so-far biggest conducted AL study, and third, analyzed the impact of each hyperparameter in the experiment results. In the end, we give recommendations about the influence of each hyperparameter, demonstrate the surprising influence of the concrete AL strategy implementation, and outline an experimental study design for reproducible AL experiments with minimal computational effort, thus contributing to more reproducible and trustworthy AL research in the future.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2506.03817

Country:

Europe (1.00)
North America > Canada (0.93)
North America > United States > Wisconsin (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Mitigating mode collapse in normalizing flows by annealing with an adaptive schedule: Application to parameter estimation

Wang, Yihang, Chi, Chris, Dinner, Aaron R.

arXiv.org Machine LearningMay-7-2025

Normalizing flows (NFs) provide uncorrelated samples from complex distributions, making them an appealing tool for parameter estimation. However, the practical utility of NFs remains limited by their tendency to collapse to a single mode of a multimodal distribution. In this study, we show that annealing with an adaptive schedule based on the effective sample size (ESS) can mitigate mode collapse. We demonstrate that our approach can converge the marginal likelihood for a biochemical oscillator model fit to time-series data in ten-fold less computation time than a widely used ensemble Markov chain Monte Carlo (MCMC) method. We show that the ESS can also be used to reduce variance by pruning the samples. We expect these developments to be of general use for sampling with NFs and discuss potential opportunities for further improvements.

artificial intelligence, machine learning, target distribution, (16 more...)

arXiv.org Machine Learning

2505.03652

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Comprehensive Benchmarking of Machine Learning Methods for Risk Prediction Modelling from Large-Scale Survival Data: A UK Biobank Study

Oexner, Rafael R., Schmitt, Robin, Ahn, Hyunchan, Shah, Ravi A., Zoccarato, Anna, Theofilatos, Konstantinos, Shah, Ajay M.

arXiv.org Artificial IntelligenceMar-11-2025

Predictive modelling is vital to guide preventive efforts. Whilst large-scale prospective cohort studies and a diverse toolkit of available machine learning (ML) algorithms have facilitated such survival task efforts, choosing the best-performing algorithm remains challenging. Benchmarking studies to date focus on relatively small-scale datasets and it is unclear how well such findings translate to large datasets that combine omics and clinical features. We sought to benchmark eight distinct survival task implementations, ranging from linear to deep learning (DL) models, within the large-scale prospective cohort study UK Biobank (UKB). We compared discrimination and computational requirements across heterogenous predictor matrices and endpoints. Finally, we assessed how well different architectures scale with sample sizes ranging from n = 5,000 to n = 250,000 individuals. Our results show that discriminative performance across a multitude of metrices is dependent on endpoint frequency and predictor matrix properties, with very robust performance of (penalised) COX Proportional Hazards (COX-PH) models. Of note, there are certain scenarios which favour more complex frameworks, specifically if working with larger numbers of observations and relatively simple predictor matrices. The observed computational requirements were vastly different, and we provide solutions in cases where current implementations were impracticable. In conclusion, this work delineates how optimal model choice is dependent on a variety of factors, including sample size, endpoint frequency and predictor matrix properties, thus constituting an informative resource for researchers working on similar datasets. Furthermore, we showcase how linear models still display a highly effective and scalable platform to perform risk modelling at scale and suggest that those are reported alongside non-linear ML models.

endpoint, input feature combination, predictor matrix, (15 more...)

arXiv.org Artificial Intelligence

2503.0887

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > Wales (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Research Report > Strength Medium (0.91)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.70)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)

Add feedback

A hierarchical approach for assessing the vulnerability of tree-based classification models to membership inference attack

Preen, Richard J., Smith, Jim

arXiv.org Artificial IntelligenceFeb-13-2025

Machine learning models can inadvertently expose confidential properties of their training data, making them vulnerable to membership inference attacks (MIA). While numerous evaluation methods exist, many require computationally expensive processes, such as training multiple shadow models. This article presents two new complementary approaches for efficiently identifying vulnerable tree-based models: an ante-hoc analysis of hyperparameter choices and a post-hoc examination of trained model structure. While these new methods cannot certify whether a model is safe from MIA, they provide practitioners with a means to significantly reduce the number of models that need to undergo expensive MIA assessment through a hierarchical filtering approach. More specifically, it is shown that the rank order of disclosure risk for different hyperparameter combinations remains consistent across datasets, enabling the development of simple, human-interpretable rules for identifying relatively high-risk models before training. While this ante-hoc analysis cannot determine absolute safety since this also depends on the specific dataset, it allows the elimination of unnecessarily risky configurations during hyperparameter tuning. Additionally, computationally inexpensive structural metrics serve as indicators of MIA vulnerability, providing a second filtering stage to identify risky models after training but before conducting expensive attacks. Empirical results show that hyperparameter-based risk prediction rules can achieve high accuracy in predicting the most at risk combinations of hyperparameters across different tree-based model types, while requiring no model training. Moreover, target model accuracy is not seen to correlate with privacy risk, suggesting opportunities to optimise model configurations for both performance and privacy.

artificial intelligence, hyperparameter combination, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.09396

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > United Kingdom > England > Bristol (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Government (1.00)
Law (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Add feedback

An Interpretable ML-based Model for Predicting p-y Curves of Monopile Foundations in Sand

Li, Biao, Song, Qing-Kai, Qi, Wen-Gang, Gao, Fu-Ping

arXiv.org Artificial IntelligenceJan-7-2025

Predicting the lateral pile response is challenging due to the complexity of pile-soil interactions. Machine learning (ML) techniques have gained considerable attention for their effectiveness in non-linear analysis and prediction. This study develops an interpretable ML-based model for predicting p-y curves of monopile foundations. An XGBoost model was trained using a database compiled from existing research. The results demonstrate that the model achieves superior predictive accuracy. Shapley Additive Explanations (SHAP) was employed to enhance interpretability. The SHAP value distributions for each variable demonstrate strong alignment with established theoretical knowledge on factors affecting the lateral response of pile foundations.

artificial intelligence, machine learning, xgboost model, (15 more...)

arXiv.org Artificial Intelligence

2501.06232

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.66)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Extending TWIG: Zero-Shot Predictive Hyperparameter Selection for KGEs based on Graph Structure

Sardina, Jeffrey, Kelleher, John D., O'Sullivan, Declan

arXiv.org Artificial IntelligenceDec-19-2024

Knowledge Graphs (KGs) have seen increasing use across various domains -- from biomedicine and linguistics to general knowledge modelling. In order to facilitate the analysis of knowledge graphs, Knowledge Graph Embeddings (KGEs) have been developed to automatically analyse KGs and predict new facts based on the information in a KG, a task called "link prediction". Many existing studies have documented that the structure of a KG, KGE model components, and KGE hyperparameters can significantly change how well KGEs perform and what relationships they are able to learn. Recently, the Topologically-Weighted Intelligence Generation (TWIG) model has been proposed as a solution to modelling how each of these elements relate. In this work, we extend the previous research on TWIG and evaluate its ability to simulate the output of the KGE model ComplEx in the cross-KG setting. Our results are twofold. First, TWIG is able to summarise KGE performance on a wide range of hyperparameter settings and KGs being learned, suggesting that it represents a general knowledge of how to predict KGE performance from KG structure. Second, we show that TWIG can successfully predict hyperparameter performance on unseen KGs in the zero-shot setting. This second observation leads us to propose that, with additional research, optimal hyperparameter selection for KGE models could be determined in a pre-hoc manner using TWIG-like methods, rather than by using a full hyperparameter search.

large language model, machine learning, twig, (15 more...)

arXiv.org Artificial Intelligence

2412.14801

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.14)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.61)

Add feedback

Hybrid deep additive neural networks

Kim, Gyu Min, Jeon, Jeong Min

arXiv.org Machine LearningDec-6-2024

Traditional neural networks (multi-layer perceptrons) have become an important tool in data science due to their success across a wide range of tasks. However, their performance is sometimes unsatisfactory, and they often require a large number of parameters, primarily due to their reliance on the linear combination structure. Meanwhile, additive regression has been a popular alternative to linear regression in statistics. In this work, we introduce novel deep neural networks that incorporate the idea of additive regression. Our neural networks share architectural similarities with Kolmogorov-Arnold networks but are based on simpler yet flexible activation and basis functions. Additionally, we introduce several hybrid neural networks that combine this architecture with that of traditional neural networks. We derive their universal approximation properties and demonstrate their effectiveness through simulation studies and a real-data application. The numerical results indicate that our neural networks generally achieve better performance than traditional neural networks while using fewer parameters.

artificial intelligence, machine learning, poly 0, (19 more...)

arXiv.org Machine Learning

2411.09175

Country:

North America > United States > California (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Towards Fair and Rigorous Evaluations: Hyperparameter Optimization for Top-N Recommendation Task with Implicit Feedback

Fang, Hui, Feng, Xu, Qin, Lu, Sun, Zhu

arXiv.org Artificial IntelligenceAug-14-2024

The widespread use of the internet has led to an overwhelming amount of data, which has resulted in the problem of information overload. Recommender systems have emerged as a solution to this problem by providing personalized recommendations to users based on their preferences and historical data. However, as recommendation models become increasingly complex, finding the best hyperparameter combination for different models has become a challenge. The high-dimensional hyperparameter search space poses numerous challenges for researchers, and failure to disclose hyperparameter settings may impede the reproducibility of research results. In this paper, we investigate the Top-N implicit recommendation problem and focus on optimizing the benchmark recommendation algorithm commonly used in comparative experiments using hyperparameter optimization algorithms. We propose a research methodology that follows the principles of a fair comparison, employing seven types of hyperparameter search algorithms to fine-tune six common recommendation algorithms on three datasets. We have identified the most suitable hyperparameter search algorithms for various recommendation algorithms on different types of datasets as a reference for later study. This study contributes to algorithmic research in recommender systems based on hyperparameter optimization, providing a fair basis for comparison.

algorithm, hyperparameter, recommendation algorithm, (8 more...)

arXiv.org Artificial Intelligence

2408.0763

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > Italy > Veneto > Venice (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback